Conversation
…ação do markup_doc e atribuições realizado nas tarefas.
…á necessário um modelo llama.
There was a problem hiding this comment.
Pull request overview
This PR addresses installation and execution issues for markapi by consolidating llama dependencies into base requirements and making the llama model mandatory for document markup operations. The changes remove duplicate Docker configurations and add test fixtures for validation.
Key changes:
- Moved llama-related dependencies from optional to required base dependencies
- Changed LLAMA_ENABLED default to True, making llama model mandatory
- Renamed database field from
statustoestatusin Reference model - Removed duplicate Docker configurations (llama.local.yml and Dockerfile.llama)
- Added test fixture DOCX files for document processing validation
Reviewed changes
Copilot reviewed 7 out of 10 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| requirements/base.txt | Consolidated llama dependencies (huggingface-hub, llama-cpp-python) into base requirements |
| requirements/extra-llama.txt | Removed file as dependencies moved to base |
| config/settings/base.py | Changed LLAMA_ENABLED default from False to True |
| reference/models.py | Renamed field from status to estatus |
| reference/wagtail_hooks.py | Updated field reference to match rename |
| llama.local.yml | Removed duplicate Docker Compose configuration |
| compose/local/django/Dockerfile.llama | Removed specialized Dockerfile |
| fixtures/e740.docx | Added test fixture (binary file) |
| fixtures/e14790.docx | Added test fixture (binary file) |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| new_reference = Reference.objects.create( | ||
| mixed_citation=linea, | ||
| status=1, # Estatus predeterminado | ||
| estatus=1, # Estatus predeterminado |
There was a problem hiding this comment.
This field rename from status=1 to estatus=1 is a breaking change that requires a database migration. Ensure that a migration file has been created and tested to handle this field rename, otherwise existing database records will be inaccessible.
|
|
||
| # LLAMA | ||
| LLAMA_ENABLED = env.bool("LLAMA_ENABLED", default=False) | ||
| LLAMA_ENABLED = env.bool("LLAMA_ENABLED", default=True) |
There was a problem hiding this comment.
Changing LLAMA_ENABLED default from False to True makes the llama model mandatory for all deployments. This is a significant behavioral change that:
- Requires the llama model file to be present in all environments
- May impact existing deployments that don't have llama configured
- Should be clearly documented in deployment/upgrade instructions
Consider whether this should be an opt-in change rather than changing the default, or ensure proper deployment documentation is updated.
| LLAMA_ENABLED = env.bool("LLAMA_ENABLED", default=True) | |
| LLAMA_ENABLED = env.bool("LLAMA_ENABLED", default=False) |
| mixed_citation = models.TextField(_("Mixed Citation"), null=False, blank=True) | ||
|
|
||
| status = models.IntegerField( | ||
| estatus = models.IntegerField( |
There was a problem hiding this comment.
The field name change from status to estatus appears inconsistent with the model name ReferenceStatus and the translation _("Reference status"). This creates confusion as the field name suggests Spanish while the related choices class and translation key use English. Consider either:
- Keeping the field name as
statusto match the choices class name - Or changing both to use consistent Spanish naming (e.g.,
ReferenceEstatuschoices class)
| estatus = models.IntegerField( | |
| status = models.IntegerField( |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 10 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
reference/models.py:29
- Renaming
Reference.statustoReference.estatusrequires a schema migration (ideally a RenameField migration to preserve data). Right now existing migrations and code still referencestatus(e.g.,reference/migrations/0001_initial.py,reference/api/v1/views.py,reference/data_utils.py), which will break at runtime after migrating. Add/update migrations and update all remainingstatusreferences toestatus(or keep the field name asstatusif the rename is not intended).
class Reference(CommonControlField, ClusterableModel):
mixed_citation = models.TextField(_("Mixed Citation"), null=False, blank=True)
estatus = models.IntegerField(
_("Reference status"),
choices=ReferenceStatus.choices,
blank=True,
default=ReferenceStatus.NO_REFERENCE
)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # huggingface-hub | ||
| # ------------------------------------------------------------------------------ | ||
| huggingface_hub==0.26.1 # https://pypi.org/project/huggingface-hub/ | ||
|
|
||
| # Python Bindings for llama.cpp | ||
| # ------------------------------------------------------------------------------ | ||
| llama-cpp-python==0.3.14 # https://llama-cpp-python.readthedocs.io/en/latest/ |
There was a problem hiding this comment.
huggingface_hub is not the pip package name (PyPI uses huggingface-hub). As written, installs will fail with “No matching distribution found”. Rename the requirement entry back to huggingface-hub==0.26.1 (the import module is huggingface_hub, but the distribution name uses a hyphen).
| new_reference = Reference.objects.create( | ||
| mixed_citation=linea, | ||
| status=1, # Estatus predeterminado | ||
| estatus=1, # Estatus predeterminado | ||
| creator=self.request.user, # Usuario asociado | ||
| ) |
There was a problem hiding this comment.
This creates references with a hard-coded estatus=1. Since ReferenceStatus exists and is used elsewhere, prefer using the enum value (e.g., ReferenceStatus.CREATING) to avoid magic numbers and keep behavior stable if the enum changes.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 10 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
reference/models.py:29
- O campo do modelo foi renomeado de
statusparaestatus, mas as migrações existentes (ex.:reference/migrations/0001_initial.py) ainda criam a colunastatuse há usos do atributostatusem outras partes do app (ex.:reference/data_utils.py,reference/api/v1/views.py). Isso vai quebrar em runtime (AttributeError) e/ou na migração do schema. Sugestão: manter o nomestatusno model ou fazer umRenameFieldvia migration e atualizar todos os acessos/serializers/queries para o novo nome de forma consistente.
estatus = models.IntegerField(
_("Reference status"),
choices=ReferenceStatus.choices,
blank=True,
default=ReferenceStatus.NO_REFERENCE
)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| new_reference = Reference.objects.create( | ||
| mixed_citation=linea, | ||
| status=1, # Estatus predeterminado | ||
| estatus=1, # Estatus predeterminado |
There was a problem hiding this comment.
Aqui foi trocado para estatus=1, mas o restante do app ainda usa status (e as migrações criam status). Isso vai falhar ao criar o objeto via Wagtail. Sugestão: usar status=ReferenceStatus.CREATING (ou equivalente) ou alinhar a renomeação do campo com migration + atualização completa do código.
| estatus=1, # Estatus predeterminado | |
| status=1, # Estatus predeterminado |
O que esse PR faz?
Esse PR realiza ajustes de instalação e execução do markapi.
Onde a revisão poderia começar?
Sugiro que para realizar a revisão seja realizado os seguintes passos:
Como este poderia ser testado manualmente?
Para realizar os testes de marcação de um documento manualmente é necessário seguir os seguintes passos: https://github.com/scieloorg/markapi/wiki/Converter-DOCX-para-XML, considere o documento e14790.docx na fixtures que está sem erro de execução.
Algum cenário de contexto que queira dar?
Removi a duplicidade de containers e tornei o modelo llama para marcação obrigatório.
Screenshots
Segue algumas telas de documentos marcados:
Quais são tickets relevantes?
Não existe tíquete para essa atividade.
Referências
N/A